In this study, we compare a cluster-based whale optimization algorithm (WOA) with the unclustered
WOA for solving the traveling salesman problem (TSP). The main goal is to reduce the time needed to
solve a TSP. First, we solve the TSP with the whale optimization algorithm; then we solve it with a
combined method that uses a clustering method called BIRCH (balanced iterative reducing and
clustering using hierarchies). BIRCH builds a clustering feature (CF) tree and then applies a
clustering method (for example, K-means) to cluster the data. Experiments performed on three
datasets show that the convergence time improves with the combined algorithm.


1. INTRODUCTION

Meta-heuristics are methods that use search agents to approximate exact solutions, and
swarm-based algorithms model their agents on animals such as bees, krill, and whales. These algorithms
are flexible, so they can solve many problems, such as the traveling salesman problem (TSP), much faster.
Researchers have therefore proposed many meta-heuristics to solve difficult optimization problems. For the
TSP, many advanced methods have been used to solve it in a shorter time. The cost function of the TSP is
minimized, which means that the shortest path is the best answer. In our previous article (2022) [1], a
continuous meta-heuristic optimization algorithm called the whale optimization algorithm (WOA) was combined
with K-means to solve the TSP; in this study, WOA is combined with the balanced iterative reducing and
clustering using hierarchies (BIRCH) algorithm to find the shortest path. This new TSP solver is called
WOA-BIRCH. In the TSP, a traveling salesman wishes to visit each of a list of m cities exactly once (where the
cost of traveling from city i to city j is $C_{ij}$) and then return to the home city with minimum cost [2].
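
The cost function described above can be illustrated with a minimal sketch. It assumes that the cost $C_{ij}$ is the Euclidean distance between city coordinates, which is an assumption of this example; the function name `tour_cost` and the four-city data are hypothetical.

```python
import math

def tour_cost(cities, tour):
    """Total length of the closed tour that visits `cities` in the order `tour`."""
    total = 0.0
    for k in range(len(tour)):
        x1, y1 = cities[tour[k]]
        x2, y2 = cities[tour[(k + 1) % len(tour)]]  # wrap around to the home city
        total += math.hypot(x2 - x1, y2 - y1)       # assumed cost: Euclidean distance
    return total

# Four made-up cities on the corners of a unit square.
cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter = tour_cost(cities, [0, 1, 2, 3])  # walks around the square
crossing = tour_cost(cities, [0, 2, 1, 3])   # uses both diagonals
```

Both orders visit every city exactly once and return home, but the second tour is longer, so a TSP solver would prefer the first.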

Metaheuristic algorithms have different characteristics because they are inspired by different natural
or biological behaviors. These algorithms need comprehensive tests on benchmark functions to
evaluate their performance. The WOA algorithm has shown good results compared with other meta-heuristics,
so it is used in different fields of engineering, such as [3]: optimizing the placement of capacitors,
building feature selection techniques, solving the economic dispatch problem, enhancing the performance of
photovoltaic power systems, efficiently balancing energy production with the load demand [3], finding proper
coefficients, cost minimization, and feature selection based on the whale optimization algorithm (FSWOA),
which aims to reduce the dimensionality of medical data [5] by selecting a reduced feature set.

In this study, after some explanation of the BIRCH algorithm, the problem is first solved with WOA,
and then we propose the combined BIRCH algorithm as another solution for the same problem. The
whale algorithm has three mathematical phases: encircling prey, spiral bubble-net feeding, and searching
for prey [6]. In this algorithm, agents hunt with a unique method called bubble-net feeding,
as explained in our previous research. Section 2 gives some theory about the BIRCH algorithm, and
section 3 presents the research method with the pseudocode and some equations. Section 4 describes the
experimental tests, section 5 discusses the results in the form of tables and figures, and
section 6 concludes the research.


2. COMPREHENSIVE THEORIES 

In this section, we explain some theory about the BIRCH algorithm, with some of the equations
that help to understand the steps of the algorithm in the next sections. Clustering is a technique
for grouping data into subsets to uncover information that was hidden in the data. This method
divides objects into subsets based on the similarity (closeness) within a cluster and the dissimilarity
between clusters. The main issues with clustering algorithms are scalability and execution time on
big data. Here, we explain the BIRCH algorithm, which is an integrated hierarchical clustering
algorithm. BIRCH is a multi-stage method in the field of hierarchical clustering techniques that addresses two
hierarchical clustering difficulties, namely the scalability and rollback problems. Hierarchical clustering (HC)
either merges the nearest pairs or divides the farthest pairs; these are called the agglomerative and divisive HC
approaches, respectively. BIRCH is a hierarchy-based technique used in data mining for solving real-life
problems. When data are inserted into BIRCH, a clustering feature (CF) tree is generated in which each
node represents a cluster: intermediate nodes represent superclusters, and leaf nodes represent the
actual clusters. The CF values summarize information about subclusters instead of storing all points [9]. The
branching (Br) parameter is the maximum number of children of a node. When a new cluster is inserted and its
parent ends up with more children than the branching factor allows, the parent splits.

A new point walks down the tree recursively, always entering the subcluster with the closest center,
until the walk reaches a leaf node. To keep the tree balanced, the nodes split recursively. When all data
are assigned, the leaf centers are passed to another clustering algorithm such as K-means. This step merges
neighboring clusters to improve them. The advantages of this algorithm are its high performance in terms of
memory, execution time, cluster quality, stability, and scalability, its support for parallel and concurrent
clustering, and the ability to tune its performance interactively and dynamically; we therefore combine it with
WOA to use it as a TSP solver. In fact, BIRCH handles an accumulated (dense) region collectively by storing a
compact summary. These dense regions are called subclusters. The algorithm operates in the following four
steps [10]: i) scanning all data and building the CF tree (loading), ii) condensing the original tree by building
a smaller CF tree, iii) applying a clustering algorithm to all leaf entries, and iv) refining the clusters
(optional). For n d-dimensional data points [11], the clustering feature, CF, and the cluster's centroid, x0,
radius, R, and diameter, D, are [12]:


$CF = (n, LS, SS)$ (1)

$x_0 = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{LS}{n}$ (2)
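
As a minimal sketch of how a CF triple summarizes a subcluster, the following example computes the centroid, radius, and diameter from (n, LS, SS) alone, without revisiting the points. The radius and diameter use the root-mean-square definitions of the standard BIRCH formulation, which is an assumption here; SS is taken as the scalar sum of squared norms, and all function names are hypothetical.

```python
import math

def clustering_feature(points):
    """Return the CF triple (n, LS, SS) for a list of d-dimensional points."""
    n = len(points)
    dim = len(points[0])
    ls = [sum(p[k] for p in points) for k in range(dim)]  # linear sum (a vector)
    ss = sum(c * c for p in points for c in p)            # sum of squared norms (a scalar)
    return n, ls, ss

def centroid(cf):
    n, ls, _ = cf
    return [v / n for v in ls]

def radius(cf):
    """Assumed definition: RMS distance from the member points to the centroid."""
    n, ls, ss = cf
    sq = ss / n - sum((v / n) ** 2 for v in ls)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

def diameter(cf):
    """Assumed definition: RMS pairwise distance between member points."""
    n, ls, ss = cf
    if n < 2:
        return 0.0
    sq = (2 * n * ss - 2 * sum(v * v for v in ls)) / (n * (n - 1))
    return math.sqrt(max(sq, 0.0))

# Made-up data: the four corners of a square with side 2.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
cf = clustering_feature(points)
```

For this square, every corner lies at distance sqrt(2) from the centroid (1, 1), which is what `radius(cf)` recovers from the summary.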

The WOA algorithm that we combine with BIRCH is summarized in the research
method section in the form of pseudocode in Algorithm 1, and the parameters of the WOA algorithm
are defined in Table 1 [13]. Among these parameters, a is a distance-control parameter that decreases
from 2 to 0; it governs the balance between exploration and exploitation and results in fast convergence.

Algorithm 1 Pseudocode of the whale optimization algorithm
(1) Generate the initial population $X_i$ (i = 1, 2, ..., n)
(2) Compute the fitness of all the candidate solutions individually
(3) While termination is not met, for i = 1:n do
(4) Update a, A, C, l, and p, where $A = 2a \cdot r - a$ and $C = 2 \cdot r$
(5) for j = 1:n do
(6) If p < 0.5 and |A| < 1, then
(7) Compute: $D = |C \cdot X^*(t) - X(t)|$, and $X(t+1) = X^*(t) - A \cdot D$
(8) else if |A| ≥ 1
(9) Select a random individual ($X_{rand}$)
(10) Compute: $D = |C \cdot X_{rand} - X(t)|$, and $X(t+1) = X_{rand} - A \cdot D$
(11) end if
(12) else if p ≥ 0.5 then
(13) Compute: $X(t+1) = D' \cdot e^{bl} \cdot \cos(2\pi l) + X^*(t)$, where $D' = |X^*(t) - X(t)|$
(14) end if
(15) end for
(16) Check search agents, compute the fitness for $X_i$, and update $X^*$
(17) end for
(18) end while


Table 1. Parameter definition
Parameter    Definition
a            Decreasing number from 2 to 0
A            Coefficient, A in [-a, a]
r            Random number [0, 1]
X            The whale's position
X*           The best solution
p            Random number [0, 1]
b            A constant for logarithmic shape
l            A value in [-1, 1]
D            Distance
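
The update rules of Algorithm 1 can be sketched for a continuous toy problem as follows. This is a simplified illustration under stated assumptions, not the authors' MATLAB implementation: A and C are drawn once per agent as scalars, a decreases linearly, positions are clipped to the bounds, and the names `woa_minimize` and `sphere` as well as all default parameter values are hypothetical.

```python
import math
import random

def sphere(x):
    """Toy cost function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def woa_minimize(cost, dim, bounds, n_agents=20, n_iter=100, b=1.0, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(max(v, lo), hi)

    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = list(min(agents, key=cost))
    best_cost = cost(best)
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter            # decreases linearly from 2 towards 0
        for i in range(n_agents):
            x = agents[i]
            A = 2.0 * a * rng.random() - a    # coefficient in [-a, a]
            C = 2.0 * rng.random()
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                # |A| < 1: encircle the best solution; otherwise search around a random whale.
                ref = best if abs(A) < 1.0 else agents[rng.randrange(n_agents)]
                new = [clip(r - A * abs(C * r - xi)) for r, xi in zip(ref, x)]
            else:
                # Spiral bubble-net move around the best solution.
                new = [clip(abs(bv - xi) * math.exp(b * l) * math.cos(2.0 * math.pi * l) + bv)
                       for bv, xi in zip(best, x)]
            agents[i] = new
        # Re-evaluate all agents and keep the best position found so far.
        for x in agents:
            c = cost(x)
            if c < best_cost:
                best, best_cost = list(x), c
    return best, best_cost

best, best_cost = woa_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because the best solution is only replaced when a strictly better one is found, the returned cost can never be worse than that of the best initial agent. Applying WOA to the TSP additionally requires a mapping from continuous positions to city permutations, which this sketch does not cover.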


3. RESEARCH METHOD 

Partitioning is one of the most important parts of cluster analysis, and many algorithms have been
implemented for it, such as K-means, K-means++, and K-medoids [21]. K-means first selects K initial
seeds and then assigns the points to K clusters by minimizing the sum of squared errors (SSE) [22]. We
apply K-means in phase 3 of the BIRCH pseudocode [23].
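
The K-means procedure described above can be sketched as plain Lloyd iterations. This is a minimal illustration with random seeding; the function names, the iteration cap, and the six-point dataset are assumptions of this example, and in BIRCH the input would be the leaf centroids rather than raw points.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c])) for p in points]

def kmeans(points, k, n_iter=50, seed=0):
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # K initial seeds
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the seed of an empty cluster
        if new_centers == centers:
            break                               # converged: centers no longer move
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, sse

# Two made-up, well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers, sse = kmeans(points, k=2)
```

On this dataset the two groups end up in different clusters, and the SSE is the sum of squared distances of each point to its own group mean.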


3.1. Pseudocode of BIRCH 

Here, the five phases of BIRCH are summarized: 1) Scan all the data to build a CF tree as in (1). 2)
Rebuild the CF tree with a larger threshold T, which yields a smaller tree. 3) Apply K-means or K-modes.
4) Refine the clusters by making additional passes over the data and reassigning points based on their
closeness to the centroids. 5) Repeat all the steps to create K clusters.


3.2. Equations for the BIRCH algorithm 

In the previous equations, the compactness of a cluster is defined by R and D, where R is the
average distance from the member points to the centroid x0 and D is the average pairwise distance within
the cluster. A CF entry is a triple (n, LS, SS) that defines a subcluster, where n is the number of data
points, LS is the linear sum of the n points, and SS is their square sum. Two CFs can be merged as follows,
which is called the additivity theorem [25]:


$CF_1 = (n_1, LS_1, SS_1)$ (5)

$CF_2 = (n_2, LS_2, SS_2)$ (6)

$CF_1 + CF_2 = (n_1 + n_2, LS_1 + LS_2, SS_1 + SS_2)$ (7)
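
The additivity theorem can be checked on a small example: merging the CFs of two disjoint subclusters gives the same triple as computing the CF of their union directly. The helper names `make_cf` and `merge_cf` and the sample points are hypothetical, and SS is again assumed to be the scalar sum of squared norms.

```python
def make_cf(points):
    """CF triple (n, LS, SS) of a list of points."""
    n = len(points)
    ls = tuple(sum(col) for col in zip(*points))  # per-dimension linear sum
    ss = sum(c * c for p in points for c in p)    # sum of squared norms
    return n, ls, ss

def merge_cf(cf1, cf2):
    """Component-wise addition of two CF triples, as in (7)."""
    n1, ls1, ss1 = cf1
    n2, ls2, ss2 = cf2
    return n1 + n2, tuple(a + b for a, b in zip(ls1, ls2)), ss1 + ss2

cluster_a = [(1, 2), (3, 4)]
cluster_b = [(5, 6)]
merged = merge_cf(make_cf(cluster_a), make_cf(cluster_b))
direct = make_cf(cluster_a + cluster_b)
```

This property is what lets the CF tree absorb points and merge subclusters without storing or rescanning the raw data.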


If we have two centroids x0 and x1 for two clusters, one distance measure that can be used is the
Euclidean distance [26]:


$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$ (8)


Other distance measures also exist, such as the Manhattan (city block) distance. We have two quality
measurements, called Q1 (mean of the radius) and Q2 (mean of the diameter). These quality measurements can
be used for pre-processing [27].


4. EXPERIMENTAL TESTS

Table 2 lists the values of our parameters, and Tables 3 to 6 show the results of the experimental tests
for the two algorithms. We performed the experiments on a MacBook with 8 GB of RAM and an 8-core
CPU, running MATLAB R2019a for both algorithms. Each algorithm runs until the termination criteria
are met. Possible termination criteria include a predefined number of iterations (here, 20 iterations),
stability of the results, and a time limit. The results shown in the tables are averaged over 20
runs of the proposed model on our datasets, which are Ali535, Rat783, and pr1002 from the TSPLIB test data
[28]. The execution time of both algorithms is measured with the tic-toc command.
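
The measurement protocol above (repeated runs, wall-clock timing, summary statistics) can be sketched as follows. The experiments themselves used MATLAB's tic-toc; `time.perf_counter` is used here as a Python analogue, which is an assumption of this example, and `time_runs` and `dummy_solver` are hypothetical names.

```python
import statistics
import time

def time_runs(solver, n_runs=20):
    """Run `solver` n_runs times and summarize the wall-clock time in seconds."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        solver()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "max": max(durations),
        "average": statistics.mean(durations),
        "stdev": statistics.stdev(durations),  # sample standard deviation
    }

def dummy_solver():
    """Placeholder workload standing in for one run of a TSP solver."""
    sum(i * i for i in range(10_000))

summary = time_runs(dummy_solver, n_runs=20)
```

The four keys mirror the Min, Max, Average, and Stdev.s columns reported in the result tables.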


5. RESULTS AND DISCUSSIONS 

Here, the steps of our algorithm start with setting the parameters. Other parameters, such as the
branching factor, can be chosen as in (11), but these numbers are experimental. For the threshold, we have
used the standard deviation of our data model, which consists of x and y for the dimensions of our dataset.
Again, this formula can change based on experience. The next equations are the variance and the standard
deviation, but we have used (14) based on experience.
1: Set the parameters as shown in Table 2.


Branching_Factor = af (11) 
$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$ (12)

$\sigma = \sqrt{\sigma^2}$ (13)

$S = 5 \times \sigma(\sigma(D))$ (14)


2: Specify K randomly, or use the equation below for big tours among the cities:


$k = \frac{N}{55}$ (15)


where N is the number of cities; alternatively, we can divide it by 2 (the standard formula). We use the
same technique for the branching factor.

3: Apply the BIRCH algorithm to cluster our data.

4: Apply the WOA algorithm to each of the found clusters, i = 1:K.

5: Find the location of agents.

6: Sort by indexing.

7: Join the clusters by finding the cities that are closest to the centroid of each cluster.

8: Repeat until all the clusters are joined.
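
The overall cluster, solve, and join flow of these steps can be sketched with simple stand-ins. In this illustration the cluster labels are given by hand instead of coming from BIRCH, a greedy nearest-neighbor path replaces the per-cluster WOA solve, and clusters are joined by always moving to the closest remaining cluster; all three are assumptions of this sketch and not the paper's exact procedure.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_neighbor_path(cities, ids, start):
    """Greedy path through `ids` from `start` (stand-in for the per-cluster WOA solve)."""
    path, left = [start], set(ids) - {start}
    while left:
        nxt = min(left, key=lambda j: dist(cities[path[-1]], cities[j]))
        path.append(nxt)
        left.remove(nxt)
    return path

def clustered_tour(cities, labels):
    """Solve each cluster separately, then join the sub-paths into one tour."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    remaining = sorted(clusters)
    current = remaining.pop(0)
    tour = []
    while True:
        ids = clusters[current]
        if tour:
            # Enter the cluster at the city closest to where the tour currently ends.
            start = min(ids, key=lambda j: dist(cities[tour[-1]], cities[j]))
        else:
            start = ids[0]
        tour.extend(nearest_neighbor_path(cities, ids, start))
        if not remaining:
            return tour
        # Continue with the cluster that has a city closest to the end of the tour.
        current = min(remaining,
                      key=lambda c: min(dist(cities[tour[-1]], cities[j]) for j in clusters[c]))
        remaining.remove(current)

# Made-up cities in three groups; the labels stand in for the BIRCH output.
cities = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (10, 0), (11, 0), (10, 1)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
tour = clustered_tour(cities, labels)
```

Each cluster is visited as one contiguous block, so the expensive search only ever runs on small subproblems, which is the source of the time saving that the combined method aims for.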


Table 2 lists the parameters: the size of the initial population, the number of iterations, and the number
of cities. Tables 3 and 4 report statistical values of the fitness function (cost function) for the
unclustered and clustered WOA. Tables 5 and 6 show the execution time for the unclustered
and clustered WOA, respectively, with the same statistical calculations for all three
datasets. Based on the results, the fitness function does not change much, but the average times in Tables
5 and 6 show that the execution time improves with the new method (WOA-BIRCH), for example from 9.03 to
4.15 on Ali535.


Table 2. The parameter settings 


Number of initial population    Iterations    Number of the cities
100                             20            535
100                             20            783
100                             20            1002


Table 3. The fitness function for the unclustered whale optimization algorithm 


Dataset    Min           Max           Average       Stdev.s
Ali535     35333.53      36400.23      35903.55      46913.33161
Rat783     1.2895e+05    1.3475e+05    1.3169e+05    1.4941e+03
Pr1002     3.9822e+06    4.1810e+06    4.0951e+06    46913.33


Table 4. The fitness function for the clustered whale with BIRCH algorithm 


Dataset    Min           Max           Average       Stdev.s
Ali535     35763.49      36702.4       36197.19      49571.8325
Rat783     1.1233e+05    1.1434e+05    1.1329e+05    642.4314
Pr1002     3.2389e+06    3.4310e+06    3.3316e+06    49571.83


Table 5. The execution time of the unclustered whale optimization algorithm 


Dataset    Min      Max      Average    Stdev.s
Ali535     8.95     9.13     9.03       0.04
Rat783     12.98    13.21    13.05      0.06
Pr1002     16.19    16.50    16.31      0.07


Table 6. The execution time of the clustered whale optimization with BIRCH algorithm 


Dataset    Min     Max     Average    Stdev.s
Ali535     4.11    4.29    4.15       0.04
Rat783     5.15    5.26    5.18       0.02
Pr1002     6.46    6.56    6.48       0.07


5.1. Figures 

Figure 1(a) shows the best cost function of the unclustered approach for the ali535 dataset over 20
iterations, and Figure 1(b) shows the connected cities in the unclustered approach for the same dataset. Figure
2(a) shows the cost function for the Rat783 dataset, and Figure 2(b) shows how these cities are connected.
Figure 3(a) shows the cost function of the unclustered approach for the pr1002 dataset during the iterations, and
Figure 3(b) shows all the cities connected by the unclustered approach for finding the best path.
In Figure 4(a), WOA-BIRCH is applied with BIRCH clustering to the Ali535 dataset,
and in Figure 4(b) all the subgraphs (before connection) are illustrated for the same dataset. In Figure 5(a),
WOA-BIRCH is applied with BIRCH clustering to the Rat783 dataset, and the subgraphs are
shown in Figure 5(b). This figure is more complicated because of the number of cities (783). In Figure 6(a),
WOA-BIRCH is applied with BIRCH clustering to the Pr1002 dataset, and in Figure
6(b), we have an incomplete graph that shows the connections before the clusters are joined.


Figure 1. Effects of selecting the WOA algorithm to solve TSP for ali535, 
(a) the values of the cost function for ali535 and (b) unclustered approach 


Figure 2. Effects of selecting the WOA algorithm to solve TSP for Rat783, 
(a) the values of the cost function for Rat783 and (b) unclustered approach 


Figure 3. Effects of selecting the WOA algorithm to solve TSP for pr1002, 
(a) the values of cost function for pr1002 and (b) unclustered approach 


Int J Artif Intell, Vol. 12, No. 4, December 2023: 1619-1627 


Int J Artif Intell ISSN: 2252-8938


Figure 4. Effects of selecting the combined BIRCH and WOA algorithm to solve TSP for ali535, 
(a) connected clusters and (b) the unjoined optimal clusters 


Figure 5. Effects of selecting the combined BIRCH and WOA algorithm to solve TSP for Rat783, 
(a) connected clusters and (b) the unjoined optimal clusters 


Figure 6. Effects of selecting the combined BIRCH and WOA algorithm to solve TSP for pr1002,
(a) connected clusters and (b) the unjoined optimal clusters


6. CONCLUSION 


In this paper, our data are clustered by the BIRCH algorithm to reduce the execution time of the
WOA algorithm when solving the TSP. The observations show that the convergence time improves. The idea was to
cluster our data with the BIRCH algorithm, then solve each part with WOA, and finally connect the clusters to
complete a tour. Based on the final results, the TSP solver WOA-BIRCH is faster than the whale algorithm alone
for the TSP; we therefore conclude that applying the BIRCH algorithm has strengthened our algorithm and made it
more applicable. The comparison measurements confirm the improvement of our algorithm.